lUCAA 02/2003 



A faster implementation of the hierarchical search algorithm for 
detection of gravitational waves from inspiraling compact binaries 

Anand S. Sengupta and Sanjeev Dhurandhar 
Inter- University Centre for Astronomy and Astrophysics, 
Post Bag 4, Ganeshkhind, Pune 4^1007, India 

Albert Lazzarini 
LIGO Laboratory, California Institute of Technology, 
MS 18-34, Pasadena, CA 91125, United States 



1 



Abstract 

The first scientific runs of kilometer scale laser interferometric detectors like LIGO are underway. 
Data from these detectors will be used to look for signatures of gravitational waves (GW) from 
astrophysical objects like inspiraling neutron star/blackhole binaries using matched filtering. The 
computational resources required for online fiat-search implementation of the matched filtering are 
large if searches arc carried out for small total mass. Flat search is implemented by constructing a 
single discrete grid of densely populated template waveforms spanning the dynamical parameters 
- masses, spins - which are correlated with the interferometer data. The correlations over the 
kinematical parameters can be maximized apriori without constructing a template bank over them. 
Mohanty and Dhurandhar (1996) showed that a significant reduction in computational resources 
can be accomplished by using a hierarchy of such template banks where candidate events triggered 
by a sparsely populated grid is followed up by the regular, dense flat search grid. The estimated 
speed up in this method was a factor ~ 25 over the fiat search. In this paper we report an improved 
implementation of the hierarchical search, wherein we extend the domain of hierarchy to an extra 
dimension - namely the time of arrival of the signal in the bandwidth of the interferometer. This 
is accomplished by lowering the Nyquist sampling rate of the signal in the trigger stage. We show 
that this leads to further improvement in the efficiency of data analysis and speeds up the online 
computation by a factor of ~ 65 — 70 over the flat search. We also take into account and discuss 
issues related to template placement, trigger thresholds and other peculiar problems that do not 
arise in earlier implementation schemes of the hierarchical search. We present simulation results 
for 2PN waveforms embedded in the noise expected for initial LIGO detectors. 

PACS numbers: 04.80.Nn, 07.05.Kf, 95.55.Yni, 97.80.-d 
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I. INTRODUCTION 



Pulsar timing experiments by Hulse and Taylor [1, 2] led to accurate measurement of 
periastron time shifts in the PSR1913+16 binary pulsar system. These matched to the 
predictions from the theory of general relativity for energy losses due to gravitational waves 
(GW) to better than 0.5% accuracy. This was the first strong indirect evidence for the 
existence of GW. Over the last decade or so, a host of laser interferometric detectors like 
the two LIGO detectors [3] in USA, GEO600 in Germany [4], the TAMA 300 in Japan [5] 
and VIRGO in Italy-Prance [6] are being built to catch these waves in flesh and unravel 
the physics of the universe encoded in them. As of present, several of these interferometers 
are in advanced stage of completion. The TAMA 300 has achieved significant observation 
time with more than 1000 hours of observation whereas the LIGO had its first successful 
scientific run in August, 2002. Thus data from these detectors promise much interesting and 
new astronomical research over the next several years. 

The most well studied sources of GW important to detection are compact astrophysical 
objects moving at relativistic speeds, like merging blackholes, spinning neutron star systems, 
inspiraling neutron star binaries etc. [7, 10]. A gravitational wave burst with a typical GW 
strain of /i ~ 10~^^ is expected to carry [8] about 80 x 10~^ watts/meter^ past the modern 
detectors over a duration of ~ 10 miUisec. However, GW interact so weakly with matter 
that even this huge fiux of energy (by electromagnetic standards) will produce only a minute 
measurable signal of the above order in the arm lengths of the detectors. Detection of such 
weak signals requires special signal processing methods to extract the signature of these 
signals buried in the noisy output of these detectors. 

Accurate waveform modeling of gravitational waves from the inspiral phase of compact 
binaries [9] makes it possible to look for their signatures using the technique of matched 
filtering and are considered to be the hkely sources to be detected by the first generation of 
detectors. In the last few minutes of their inspiral, the frequency of the emitted GW will he 
in the sensitive bandwidth of these detectors. Matched filtering involves cross correlating 
the detector output with a bank of template waveforms each having a different set of pa- 
rameters. This is necessary since only the form of the expected signal is known and not its 
exact parameters a priori. Together, these templates span the whole range of parameters 
as determined from astrophysical considerations. This procedure is computationally less 
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efficient if implemented in a flat method which uses a very dense grid of templates to span 
the parameter space. The method is very well described in earlier literature - see [11-14] for 
example. A more efficient implementation of the flat search is desirable for several reasons. 
Eventually the interferometers will run continuously and real time analysis must be carried 
out in order to keep up with the data streams. This means that the time taken to analyze a 
data segment of duration T seconds, must be < T seconds in real time. By the same token, 
the computational efficiency of real time algorithms will limit the volume of the parameter 
space which one can search for GW signals in interferometer data. The desirability for the 
improved computational efficiency promised by the hierarchical algorithm presented here 
is twofold: (i) to date, there has been little or no optimization of the implementation of 
parallelized matched filtering algorithms and their performance will not scale with larger 
numbers of processors [16]; (ii) recent developments in the methodologies to handle more 
massive compact binary systems and to include non aligned spin have shown the need for a 
greater number of templates in a higher dimensional search space than has been applied in 
the first searches for lighter mass systems [23, 27]. 

A faster implementation of the matched filtering procedure was proposed by Mohanty and 
Dhurandhar [29, 30] which could reduce the computational cost of the search algorithm by 
a factor of 25 to 30 without sacrificing the efficiency of the fiat-search method. It essentially 
involved using hierarchically staged analyses on two template grids instead of one dense grid. 
The first layer was that of trigger stage templates which were placed more sparsely than 
the second stage templates. Any crossing over the trigger threshold would be followed up 
locally by the fine bank. This trigger threshold must be carefully chosen such that not only 
could false alarms be kept in control, but also the detection probability of a genuine signal 
will be higher than a prescribed minimum (usually ~ 95%). The need for further improving 
upon the efficiency of implementing the matched filtering method is more pressing at present 
than ever - mainly because of our experience with real interferometer data over the recent 
past. Some of the benign assumptions which are used in prescribing the canonical search 
actually do not hold true in real time data - for example the stationarity and Gaussianity 
of interferometer noise is a key assumption to almost all the previous work. However, drifts 
in the noise power spectra and tail features therein severely undermine the efficiency of the 
matched filtering algorithm [19]. Thus hidden costs are incurred in carrying out even a fiat 
search where vetoing [17] is now prescribed to curb high unprecedented false alarm rates. 
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The TAMA data analysis team [18] had to put such vetoing even in the trigger stage of the 
hierarchical search. Such vetoing techniques add to the total cost of the online computational 
budget. An alternative implementation to the matched filtering has been proposed recently 
using the Fast Chirp Transform (FCT) method [20]. The complexity of this algorithm is 
comparable to that of the conventional matched filtering technique. It may become simpler 
under certain circumstances if it is possible to transpose the order in which the associated 
2-D FFT is calculated. To date, however, this has not been explored further. 

In this paper, we propose a faster implementation of the canonical hierarchical search 
method. Because most of the power in the chirp signal is at low frequencies - the power 
in the signal falls off as /~^/^ where / is the frequency - one can truncate the Fourier 
transformed data at a relatively lower frequency (we choose 256 Hz) and nonetheless retain 
sufficient signal power (~ 92%) [24]. Then the key idea is to sample the data at a lower rate 
(512 Hz) in the trigger stage which gives us smaller FFT data sets to work upon. Lowering 
the first stage sampling rate also leads to a changed filter placement scenario where the 
absence of higher frequency modes of the GW chirp causes the ambiguity function to fall 
off more gradually and thus individual trigger templates can cover a bigger volume of the 
parameter space. This also leads to further computational advantage since the number of 
trigger templates is reduced. However, the lowered trigger thresholds and reduced signal to 
noise ratio due to this method will tend to increase the false alarm rate. We discuss these 
issues and supplement our solution with simulation results. In an earlier paper [24] we had 
outlined some of the above ideas. However, the scope of that paper did not include such 
complications as rotation of the parameter space, innermost stable circular orbit (ISCO) 
frequency cutoffs etc. Here, we take into account all these realistic scenarios to propose an 
extended hierarchical search (EHS) algorithm which can reduce the computational cost by 
a factor of > 65 over a flat search. The family of restricted, spin-less 2PN family of chirp 
waveforms is used for our computation. We also use the target noise power spectral density 
of initial 4 km LIGO interferometers in our simulations [32]. 

II. THE FLAT SEARCH 

In this section, we at flrst gloss over the necessary background and notation in order 
to make the paper self contained. We then estimate the online computational cost of the 
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flat search. This will prepare us for the EHS and comparing the cost advantages thereof. 
Generic time domain functions h{t) will be denoted by h{f) in the Fourier domain, where 

/■oo 

h{f) = / dt h{t)e^'''f'. (2.1) 

J—oo 

Engineering and much of LIGO analysis software uses the opposite sign for the exponent 
in (2.1). We however maintain this notation for consistency with the published literature 
[19, 29]. 



A. Matched filtering 

The stationary phase approximated Fourier transform h{f) of the spin-less, restricted 
second post-Newtonian binary chirp is given up-to a constant factor J\f by 



h{f)=^^f-'/'exp^ 



■^-$„ + *(/;M,7;,t„) 



(2.2) 



where M = mi + m2 is the total mass of the binary system, mi and m2 being the individual 
masses of the stars; r/ is the ratio of the reduced mass to the total mass; $a and ta are 
respectively the arrival phase and the time when the frequency / attains the fiducial value 
/„. The factor J\f depends on the masses, /„ and the distance to the binaries. The function 
"^{f; M, T), to) describes the phase evolution of the inspiral waveform and is given by 

*(/; M, n, Q = 2nft^ + ^ [(TrM/)-^/^ + + ^r^j (ttM/)"^ 

/ 2/s / 15293365 27145 3085 ^\ , , , i/s1/^on 

-16.(.M/)-/= + + — , + — (.M/)-'/3j (2.3) 

It is important to distinguish between the nature of parameters that appear in the above 
form of the GW chirp. The two mass parameters fl = {M, rj} are the dynamical parameters 
and determine the shape of the chirp. Exclusive to this set are the 'offset' parameters 

— * 

A = {ta, which determine the duration of the chirp or the end-points in the time series. 
The latter can be quickly estimated in the matched-filtering paradigm without having to 
construct template banks over them. For example, spectral correlators using FFTs allows us 
to estimate correlations at all time lags and thus estimate ta- Similarly, using the quadrature 
formalism to construct template banks, the initial phase can be estimated analytically. 
However, the dynamical parameters need to be tackled by building a grid of templates which 
span them. In future, our parameter-space will refer explicitly to these mass parameters 
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unless otherwise mentioned. To ease the computation, a new set of time parameters {tq, t^} 
that are functions of the masses is chosen in which the metric defined over the parameter 
space is approximately constant. They are related to {M, rj} by 

In other words, the template bank is set up over p, = {to^Ts}. The form of (2.2) allows us 
to write the explicit quadrature representation of the i-th template as 

Hf; i^i, ta, ^a) = hcif; jli, ta) COS + h,{f; jli, ta) siu (2.5) 

where, 

hc{f;iii,ta) = ihsif; fli,ta) 



(2.6) 



The normalization constant jV in (2.5) is fixed by demanding unit norm of both he and 
hg i.e. {he, he) = {hs, hg) = 1, where the scalar product (a, b) of two real functions a{t) and 
b{t) is defined as 

(M)^2fd/ '-'^W>;- . (2.7, 

Jfl ^n[J> 

where, we use the Hcrmitian property of Fourier transforms of real functions. 5'„(/) is 
the one sided power spectral density of the noise of initial LIGO detectors [32]. Further, 
< / < /u is taken to be the effective spectral window for computing the scalar product. 
The lower frequency cut-off // is dependent on the sensitivity of the detector to seismic 
vibrations and is taken to be 30 Hz, whereas the upper frequency cut-off is equal to half 
the full sampling frequency taken to be 2048 Hz in this paper. In practice, the upper cut-off 
frequency for the signal is held back from the Nyquist rate by a factor ~ 0.8. Therefore, for 
normalizing the templates, we choose an upper cut-off frequency fc which is the minimum 
of 800 Hz and the so called ISCO cut-off frequency. The latter corresponds to the last 
stable circular orbit at which the inspiral phase is deemed to end and is relevant only for 
high masses when the ISCO frequency is less than 800 Hz. For low masses where the ISCO 
cut-off frequency is more than 800 Hz, the loss in SNR due to this band hmiting is ~ 0.14%, 
which is negligible. 

Having thus set up a template bank, the statistic p for an output signal s{t) of the 
interferometer is the maximum of the signal to noise ratio (SNR) over all the templates 



(labeled by i) and the time-lags ta- Thus, the statistic p is given [33] by, 



max 



max J (s, hc)'^ + (s, hsY 



(2.8) 



which is then compared with a pre-determined threshold C,. Any crossing above this threshold 
is recorded as a candidate event in fiat search scenarios. Let the SNR be maximized by the 
i-th template and at the time-lag ia- The phase offset of the chirp can then be calculated 
as: 



tan 



(g, hs) 
{s,hc) 



(2.9) 



A detailed description of this procedure can be found in [11-14, 29]. 



B. Computational cost of flat search 

Since jl is spanned by a discrete grid of templates, it might so happen that a signal 
arriving at the interferometer has parameters that do not correspond to a grid point. In this 
event, the correlation of such a signal with the nearby templates will fall below the maximum 
obtainable value. The degree of discretisation is governed essentially by the rule of thumb 
- that in the worst case, the loss in correlation shall not exceed 3% of the maximum. This 
corresponds to a loss of event rate of roughly 10% which is an agreed upon number in the 
community. Given the range of masses which act as search limits and set the boundary 
of jl, a grid of templates is constructed with the above idea. The main computational 
cost incurred in the fiat search method outlined previously is in performing the Fast Fourier 
Transforms (FFTs) and multiplying the elements. If a data segment of duration T , sampled 
over N points is parsed through a flat search, utilizing A/t templates, the order of floating 
point operations is 

Kp--QKNlog^{N). (2.10) 

If these many operations are to be carried out in real time, the computational speed in 
floating point operations per second (Flops) is given by 

A/kop = (2.11) 

pad 

where Tpad is the padding length in the data segment. Computationally it is advantageous to 
use stretches T that are significantly longer than the duration ~ 3 x ATchirp, where ATchirp 



8 



is the duration of the longest chirp. However, in order to preclude losing chirps that occur 
within ATchirp of the ends of the longer data stretch T, it is necessary to overlap adjacent 
epochs of data by this amount. Thus the effective computational time available to analyze 
a data stretch of duration T is Tpad — T — ATchirp- 

For the mass range of 1 Mq < M < 30 M©, the longest chirp signal occurs when each of 
the component masses is equal to 1 Mq, and is ~ 95 sec in duration. The data segments are 
taken to be 512 sec each (greater than three times ATchirp), so Tpad ~ 417 sec. The parameter 
spaces for the individual mass ranges of 1 < M < 30 and 0.5 < M < 30 in 
the {tcTs} coordinates are shown in Fig. 1. 

The mass range is set to 1 Mq < M < 30 Mq in this paper and is typical for the initial 
detectors. The lower mass limit is chosen relatively higher, because higher masses lead 
to higher SNR (the SNR scales as M^/^), in order to compensate for the relatively lower 
sensitivity expected for the initial detectors. Choosing the fiducial frequency /„ = 40 Hz, the 
area of the parameter space is 8.5 scc^. If the parameter space is covered by a rectangular 
lattice of templates, obtained by using the largest inscribed rectangle within the contour 
of 3% mismatch, the density of templates is ~ 1300/ sec^. Multiplying the density by the 
area gives about At = 11,050 templates which are needed to cover the parameter space. 
One typically takes 512 sec data trains sampled at 2048 Hz which gives N — 2'^^. With a 
padding length of Tpad ~ 417 sec and using Eq. (2.10, 2.11), we see that the onhne Flop 
rating required is ~ 3.2 G-Flops. In addition to the cost of FFTs, there arc overhead costs, 
incurred in element by element multiplication of the data and template vectors and data 
preprocessing. However, we neglect them in our analysis since they are small compared to 
the FFT costs. 

The above estimate however does not take into account boundary effects, where the 
templates can spill out of the searched parameter space because of imperfect tihng. This 
effect can considerably increase the number of templates. The current LIGO template 
placement code [14] was run for the above parameters and was found to generate almost 
three times as many templates as in this simple estimate. Since the current code sets the 
fiducial frequency equal to the lower cut-off frequency, fa was set at 30 Hz for the purpose 
of obtaining the number of templates. 

Lowering the mass limit, increases the area of the parameter space and thus the compu- 
tational cost. The results are described in Table I. 
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FIG. 1: Parameter spaces for the individual mass ranges 1 — 30 Mq (solid line) and 0.5 — 30 Mq 
(broken line) in (tqjT^) co-ordinates with fa = 40 Hz. Note that the tq should be multiplied by 
{fa/ fi)^^^ in order to get the Newtonian chirp time. 

TABLE I: Estimates of number of templates and online computational cost for flat search. Tem- 
plates are placed using the method prescribed by Owen and Sathyaprakash [26]. The prescribed 
minimum mismatch level is set at 0.97. The area of the parameter space is computed for the 
fiducial frequency of fa = 30 Hz. The upper mass limit in all the cases is set to 30 Mq. 



Minimum Mass Limit 


Area'^ 


Mt 


Aflop 




(sec^) 




(G-Flops) 


LOM0 


29.3 


3.34 X 10^ 


10.1 


O.5M0 


172.9 


1.64 X 10^ 


49.5 


O.2M0 


1687.4 


1.16 X 10^ 


350.0 



"The area of the parameter space scales as /a 

As mentioned before, the template placement code presently implemented employs effec- 
tively a rectangular tiling metliod. We note here that if a hexagonal closed packing method 
were used instead, significant reduction in the online computational speed can be accom- 
plished, because of the efficiency of this method in covering the parameter space is greater 
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than rectangular packing [21, 24, 31]. As a matter of fact, the hexagonal tiling method gives 
a template density which is about ~ 25% less, thus effectively reducing the required online 
computational speed by the same factor. 



III. EXTENDED HIERARCHICAL SEARCH 

The flat search described in the previous section is not computationally optimal. This 
is due to the fact that a grid of large number of templates is required in order to densely 
span fl, satisfying the criterion that the minimal correlation for two neighboring normalized 
templates docs not fall below a prescribed value of 0.97. 

It turns out that we can significantly reduce the computational cost by constructing 
two sequential grids of varying template densities instead of one [29, 30]. The philosophy 
behind this approach is the fact that one expects real events to be not only weak but also 
very rare. This implies that most of the time, one is only spending time in sifting through 
interferometer noise. In this sense, scanning the parameter space so thoroughly at each 
observational epoch is wasteful. On the contrary, if a trigger mechanism exists which would 
pick up only statistically likely candidates, then a higher parameter resolution search (flat 
search) can be performed locally on pre-screened candidates possibly containing a chirp 
event. Thus figuratively, the hierarchical search method is a layered search. The coarse 
bank layer is the trigger which flags a likely event. This is then followed up by the fine 
grid of templates a la flat search method described above. The difference is that only a few 
flne grid templates can accomplish the job of thoroughly verifying the claim of the trigger 
stage coarse grid templates. The computational cost thus is reduced as fewer templates are 
employed in the overall scheme. 

The above idea can be extended even further to include the time of arrival of the signal 
in the overall scheme of hierarchy as we shall now explain. Prom (2.2), one notices that the 
signal power spectrum is a decreasing power law in frequency : 

Wfoc/-^/^ (3.1) 

Prom (3.1), we see that most of the signal power is contributed from the lower frequency 
bands. An immediate consequence is that one can cut off the chirp at a lower cut-off 
frequency and re-sample it at a lower Nyquist rate without losing too much power in the 
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signal. In the EHS we propose to sample the trigger stage data train at a much lower 
sampling rate than that used in a flat search algorithm. As a proof of principle, we notice 
that the fractional signal to noise ratio p as a function of an arbitrary upper cut-off frequency 
/cut-off is given by p(/cut-off) = Hfcnt-os)/I{fc), where 



If the ISCO cut-off is used, /cut-off must be replaced by the appropriate plunge cut-off 
frequency. However, this is applicable for high masses in the first stage (M > 16 M©), 
which is a tiny fraction of the parameter space - implying a negligible effect on the overall 
gain factor. A significant fraction of the SNR can be recovered from a relatively small value 
of /cut-off- Specifically, for /cut-off — 256 Hz, (i.e. a sampling rate of /^amp = 512 Hz) almost 
92% of the signal power is recovered. The most obvious advantage of lowering the Nyquist 
rate in the first stage is that we have to contend with fewer points in computing the FFTs, 
(the cost of which scales as A^log2A^), leading to a reduction in first stage computational 
cost. For example, a chirp cut-off at /cut-off = 256 Hz reduces the cost of FFT by almost a 
factor of 4. 

The equation (3.2) correctly estimates the loss in SNR if the time of arrival parameter 
ta of the signal coincides exactly with a sampled time in the time-series. If, as will be the 
case in general, ta is not a sampled time, then there will be an additional loss of SNR due 
to mismatch in time of arrival parameter. Considering the 'worst case' scenario when ta of 
the signal lies exactly between two consecutive time samples, we numerically estimated this 
additional loss in SNR in two different cases : (a) when the chirp times of the signal match 
exactly with those of a template and (b) when they lie exactly in between two neighboring 
templates. In the former case, we found that the additional loss in SNR is less than 5% 
and in the latter case, a little more than 2%. It is to be noted however, that our template 
placement schema (see Sec IIIC) introduces some overlap between neighboring templates, 
which effectively reduces the maximum mismatch. We find that this adequately compensates 
for the additional loss in SNR and no further reduction of first stage threshold (prescribed 
in Sec IV A) is necessary. 

In the earlier method given by Mohanty and Dhurandhar [29], the hierarchy was on the 
two mass parameters alone in the sense that a coarse template bank spanning the mass 
parameters was used as the trigger, followed up by a fine template bank. Analogously in the 




(3.2) 
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EHS algorithm, we not only retain this hierarchy of templates spanning the mass parameters 
but also extend it to the time of arrival by coarser sampling of the signal at the trigger stage. 
It is in this sense that the EHS should be thought of as a three dimensional hierarchical 
search. 

But quite aside from this, lowering the Nyquist rate also affects the contours of the 
ambiguity function Ti. which in turn affect the template coverage in the first stage. Also 
a lowered signal power in the first stage will require lowering the trigger threshold. This 
will affect the false alarm rate due to noise in the interferometer and in turn the overall 
computational cost. In the next few subsections, we systematically address these issues and 
prescribe the optimum method of implementing the EHS algorithm. 

A. Setting up the fine bank 

In this subsection, we will discuss the setting up of the fine bank templates and the second 
stage threshold. 

The templates in the second stage or the fine bank stage of the EHS is set up in a way 
that is identical to the flat search method discussed in the earlier section. The ambiguity 
function H. centered at the is defined as the correlation between two neighboring normalized 
templates whose mass parameters differ by A/I: 



The contours of Ti are closed curves and the area enclosed within this curve will be frequently 
referred to as a tile. The shape of this tile depends on the contour level V where < F < 1. 
For values of F close to 1, the tiles are elliptical in shape. However, as F significantly differs 
from unity, the shape can be quite irregular. This is important from the point of view of 
template placement where the parameter space must be efficiently covered by these tiles. 
In the fine bank, the templates are laid out in such a way that nowhere does F fall below 
0.97. For such high values of F, the ambiguity function can be approximated quite well by a 
Taylor expansion of H. truncated after the quadratic term. This allows analytic calculation 
of the dimensions of these elliptical tiles and placement of templates [25, 26] over jl. This is 
not true, however, for smaller values of the match level. The number of templates indirectly 
affect the minimum observable signal strength in the second stage of the hierarchical search 




(3.3) 
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method and thus the overall minimum detectable signal strength. 

In order to proceed with the numerical analysis, the noise in the interferometers is as- 
sumed to be a stationary, zero-mean Gaussian random process. Thus the integrated signal 
to noise ratio p calculated using (2.8) should also be considered to be a reahzation of a 
random process whose probability distribution depends on the presence or absence of the 
signal. In the presence of a signal, the distribution is a Rician, whereas in the absence of 
the signal, p is Raleigh distributed. The detection strategy is to set a threshold C such that 
given one instance of p, a decision can be made with some statistical significance, whether 
or not, it was drawn from a Raleigh or a Rice distribution - which would correspondingly 
imply the presence or absence of a GW signal. 

Given a sufficiently high C,2 (in the second stage), even in the absence of a signal, there 
may be some crossings of p above this value leading us to incorrectly deduce the presence of 
a signal. This threshold is thus primarily set such that the probabihty of these false alarms 
is no more than once per year. Thus, C2 is set at 



where, the data train is sampled at /samp Hz and Tyj. — 3.15 x 10 are the seconds in a 
year. The factor e arises from the correlation between templates and correlations between 
successive time-lags in the filtered output [28]. Because the chirp filter is a narrow band 
filter and band-passes the data, the inverse Fourier transformed data in the time domain 
becomes correlated as it is essentially generated from few frequency bins. Thus e gives the 
fraction of the number of independent Gaussian random variables in the total filter bank 
output. Prom our simulations we find that e ~ 0.6 in the second stage where the templates 
are closely spaced, and e ~ 0.9 in the first stage where the templates are farther apart. 
If there had been no correlation between the templates, we could have set e = 1. Using 
Eq.(3.4), and the fact that the sampling frequency in the second stage is 2048 Hz, we find 
that the second stage threshold must be put at ^2 = 8.2. If this threshold coincides with the 
strength of minimum observable signal Sohsi then the detection probability is just 50%. In 
order to achieve higher detection probability, the minimum observed strength of the signal 
should be somewhat greater than ^2- Following [30], we compute the detection probabihty 
by considering the joint probability of two neighboring templates around the signal. Then 
for attaining a minimum of 95% detection probability, the minimum strength of the signal 




(3.4) 
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should be S'obs = C2 + where ~ 0.7. The second stage templates can thus detect 
a signal of strength S'obs = 8.9 with a detection probability of 95% or greater. From the 
minimum match that we have chosen (97%), we see that this corresponds to a signal of 
minimum strength ^'min = 9.2 arriving at the detector. 

B. Setting up the trigger stage templates 

The data trains in the trigger stage of the extended hierarchical search are sampled at 
512 Hz, and the template chirps are terminated at 256 Hz, so as to avoid power aliasing 
effects in the FFTs. These templates are used as triggers. The neighborhood of a trigger is 
then searched using the second stage fine bank templates. 

We begin by describing the contours of Ti, when the GW chirp is cut-off at a lower 
frequency, i.e. when /cut-oflf = 256 Hz. The contours at different F levels are not only 
irregular in shape, but cover larger tile areas compared to those with cut-off /cut-off — 800 Hz. 
The basic reason for the irregular shape of tiles is that, the higher order post-Newtonian 
(FN) terms that appear in the phase function \E' in Eq. (2.3) decay slowly with the frequency 
/. The phase function ^ appears implicitly in the integral of the ambiguity function. While 
the Newtonian term (the Tq term) falls off as the T2, T3, T4 (1 FN , 1.5 FN, 2 FN) terms 

fall off as /~^, /~^/^, f~^/^ respectively. The 1.5 FN and 2 FN terms contribute to the phase 
significantly at high frequencies since their fall off is relatively slower. Thus an upper cut-off 
of 256 Hz verses 1024 Hz produces tiles that differ significantly in shape. The irregular and 
asymmetric shape of the contour can also be viewed as an effect of the curvature of the 
manifold. Because the 2 FN or the term has not decayed, it contributes to the metric 
making it non-constant in the {tq, T3} space. The bigger tiles are essentially due to the 
fact that, it is the higher frequencies which resolve between the masses, and if these are cut 
off, the ambiguity function decays more gradually, resulting in bigger tiles. At F ~ 0.8 the 
tiles with /cut-off = 256 Hz are larger in area than those with /cut-off = 800 Hz by about 
a factor of 2. Moreover, because the contours are large, non-local effects are important, 
compounding the curvature effects and adding to the irregular shape of the contour. 

In the earlier hierarchical scheme, the ambiguity function in both the trigger and fine 
bank stages were the same. The trigger stage templates were constructed by letting the 
trigger threshold drop from C,2 to a lower value Ci such that a balance was struck between 
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the following opposing forces : Ci was low enough to allow larger tiles to cover the parameter 
space with fewer templates and thus reduce the computational cost, but high enough so that 
false alarm crossings due to noise do not compromise the cost advantage by increasing the 
second stage cost. Signal power in the trigger stage was the same as in the fine bank stage 
since the sampling rate was the same in both stages. 

In the EHS implementation of the hierarchical search, we are faced with a different 
proposition - we have reduced signal power in the first stage due to lower cut-off frequency. 
Also, we have a different 7i in the two stages. Given these differences, we need to devise 
how to set up the coarse grid of templates. We now describe the template placement for the 
first stage of the EHS algorithm. 

C. Setting up the trigger stage tiles 

The tiling problem stated from an operational point of view is the following : given 
the prescribed match level T for the trigger stage, we need to tile the parameter space 
efficiently with closed contours of Ti. The following conditions must be satisfied [31] for 
efficient template placement: firstly, minimal mismatch must be satisfied i.e. 7i >r inside 
and on the contour. Also, templates should be placed such that there are no uncovered 
spaces (no holes) and there is minimal overlap between templates. For computational ease, 
the templates should preferably lie on a regular grid or at least a piece-wise regular grid. 
As stated earlier, the ambiguity function Ti. is quite wide at low frequency cut-offs since 
the mass resolution is poor in absence of high frequency components. We choose F = 0.8 
contours in the trigger stage to tile the available parameter space, since the total cost is 
minimum for this value of F. The results quoted at the end of the paper are also for other 
values of F but close to 0.8. It is quite unwieldy to use these contours prima facie to tile 
the parameter space. A possible solution is to carve a regular polygon - e.g. a rectangle 
out of these contours and use them as tiling blocks. To minimize the computational cost, 
we must choose the rectangle as large as possible. The details of a template placement 
(tiling) algorithm are now described. The demands made upon the tiling algorithm are 
several - (i) given a F = 0.8 contour centered at some point in the parameter space, the 
largest rectangle should be identified automatically and used as the representative tile at that 
point, (ii) having placed a template at a point in the (tq, t^) space, the optimum position of 
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the neighboring templates should be estimated such that the template placement conditions 
are satisfied, and lastly (iii) the parameter space in (tcTs) co-ordinates has non-trivial 
curvature because of which the contours are asymmetric and also have differential rotation 
as a function of {to,Ts) and one cannot set up a global regular grid for arbitrary values 
of r. The tihng scheme should correct for these differential rotations so that there are no 
uncovered spaces due to differential rotation of the rectangular tiles. The description of 
this tiling algorithm proceeds in the following logical sequence - we first describe a method 
of carving a large rectangle given a contour of at a specified value of F. Thereafter a 
method for placing neighboring templates is described. This method automatically takes 
into account the differential rotation of the templates. Finally, an overall placement schema 
is given which involves 'stacking'. 

The steps involved in identifying a sufficiently large rectangle given a F = 0.8 contour is 
illustrated using one centered at (tq = 15.0 sec, = 0.88 sec). The rectangle must ahgn 
itself along the length of the contour. So the first step is to identify the orientation of the 
rectangle. This is done by fitting straight lines to the concave sides of the contour with 
respect to the center. The points are so chosen that it also includes the one which is radially 
closest to the center as can be seen in Fig 2(a). If the slopes of these two lines are tan 6i and 
tan ^2 respectively, then the side of the rectangle is chosen to have the average slope tan^: 

tan^ = ^(tan^i + tane2). (3.5) 

The next step involves drawing two parallel lines with their slope tan^, such that they (a) 
are as far out as possible from the center (b) intersect the contour at only two points (c) 
lie completely inside the contour. This is clearly shown in Fig 2(b) where the intersection 
points are labeled as A,B,C and D. 

Two adjustable parameters are used to fix the distance of each side of the rectangle from 
the center of the contour. These parameters should be chosen such that conditions given 
above are satisfied. Due to the peculiar shape of the contour, the lines (see Fig 2(b)) may 
intersect the contour at multiple points. This must be avoided by fine tuning the parameters. 

The final step is to read off the co-ordinates of the rectangle by dropping suitable per- 
pendiculars. This is shown in Fig 2(c), and the resultant rectangle is shaded. In general, if 
the axes are scaled differently, the rectangles would look like parallelograms. 

By following the above steps, a sufficiently large rectangle can be identified from within 
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FIG. 2: This figure illustrates the sequential steps for obtaining the inscribed rectangle and placing 
the templates. Units of tq and are in seconds, (a) The orientation of the rectangle is first 
determined by fitting straight lines to points around the concave sides of the contour, (b) Then 
lines with slope m = tan 9 are drawn inside the contour on either side. These intersect at four points 
A,B,C,D. (c) The rectangle is then extracted from the four points by dropping perpendiculars, (d) 
Figure illustrating the placement of two neighboring templates along some curve /(to,T3) = 
parallel to the = 1/4 line. Note the slight overlap between the templates due to A9 = 0.025 
correction and the negligible spillage of rectangle below the equal mass line. 

any prescribed contour. Before placing neighboring templates, one needs to correct for the 
differential rotation of the templates. This is achieved with an adjustable angular param- 
eter in the schema which deliberately induces some overlap between nearest neighbor 
templates by reducing the effective length and breadth of the rectangle that is used. If the 
actual dimensions of the rectangle are / and 6, the A6'-corrected values /' and h' are given by 

(3.6) 

(3.7) 



h' = it&n 



I' = h tan 



tan-i(-) - 



tan"i(-) - A^ 
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The parameter A^^ should be chosen very carefully so that minimal overlap condition is 
satisfied. The curves in Fig 3(a) show the overlap fraction between neighboring rectangles 




Fraction of templates 



FIG. 3: (a) Percentage overlap between neighboring templates after A^-correction. The values 
are /S.9 = 0.025 (top) and = 0.05 (bottom). Percentage overlap is defined as the ratio of area 
common to two neighboring templates to the area of any one of them expressed as a percentage. 
Note that fewer templates are required for smaller parameters, (b) Cumulative distribution 
of templates according to the A0-induced area overlaps between neighbors for A0 = 0.025. The 
average percentage overlap in this case is 5.7%. The dotted line shows that ~ 63% of the templates 
have percentage overlaps less than this average value. 



as we move along the tq axis for A^^ = 0.05 and A0 = 0.025 respectively. We need to tweak 
the parameters such that the overlap approaches zero as quickly as possible without ever 
going below it. This will guarantee a minimum number of templates. Fig. 3(b) shows that 
for AO = 0.025, most of the templates have neighboring overlaps of ~ 5% of their width. 

One would like to place the templates along some curve /(tq, t^) = 0. For example, this 
curve could be the equal mass line [rj = 1/4). Once the curve is chosen, the center of the 
neighboring template is found by moving out by a distance Atq along the tq axis given by 

cos P 



Aro = 6'- 



(3.8) 



sin(^ - (3) ' 

where, b' is the A6'-corrected breadth and the local slope of the curve /(tq, t^) = is tan j3. 
For the t] = 1/A line (mi = 1712), the slope can be analytically determined : 



tan P 



dTo/dM' 
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(3.9) 



where 2 M0 < M < 60 Mq is the total mass of the compact binary system. The tq co- 
ordinate of the center of the new rectangle is given by Tq + Atq. The co-ordinate can be 
found from the equation of the curve. 

This is a dynamic method of placing the templates. However, we do not go to such 
lengths in placing the neighbors. Since the average orientation of the rectangle is known, a 
constant fudge-factor in multiplying b' is quite sufficient. In this case, one relies entirely on 
the dynamical change in b' across the parameter space. Choosing Atq ~ 1.286' is sufficient 
for our purposes. This choice will slightly increase the first stage cost. The placement of two 
neighboring templates is shown in Figure 2(d). Note that the overlap between the rectangles 
is small. 

Our overall strategy of template placement is to build stacks of templates as can be seen 
in Fig 4. Here it is worth noting that all earher schemes [13, 31] start by placing templates 
on the equal mass [r] = 1/4) line. However, the space below this line is unphysical - as such, 
almost 50% of the templates centered on this line will be wasted and the final result is that 
we end up using too many of first stage templates in covering the parameter space. We can 
ensure more efficient coverage by constructing templates on a line parallel to the 77 = 1/4 
line but which is offset by parameters £p and 9 as : 



where £p is so chosen that the rectangle constructed spills minimally below the equal mass 
line. Due to the asymmetry of the contour, ip need not be equal to half the length of the 
inscribed rectangle. 

By shifting up the templates in this way, a larger area of the parameter space is covered 
than if their centers were placed on the equal mass line. Even though one needs to place 
templates all along the curve for the first stack - for the subsequent stack, templates need 
to be placed only between points of intersection of the top of the templates of the earher 
stack with the parameter space boundaries mi = Mmax and mi = Mmin. Thus the first 
stack completely covers the low mass end of the parameter space from tq > 7 sec. Also the 
high mass end of the parameter space, Tq ^ 2sec is covered by the first stack. Having placed 



To = To,^=l/4 +^pC0s6l, 



(3.10) 



(3.11) 
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FIG. 4: Figure illustrating the stacks of templates on top of each other covering a part of the 
targeted parameter space. The dark lines are the boundaries of the parameter space for the mass 
range 1 Mq < mi, m2 < 30 Mq. Due to the large size of the rectangles, the hierarchical search will 
actually cover more of the parameter space than the flat search it replaces. Note that this figure 
is a zoomed view of the first 20% of the tq domain. 

the first stack of templates in this way, the next stack is constructed by choosing the curve 
that is parallel to the first stack, but offset further according to the sizes of the templates in 
the previous stacks and adjusted for minimal overlap. The two stacks completely cover the 
parameter space. We find that the total number of coarse templates required is JV^^^ ~ 510 
for the contour level of F = 0.8. 

If the lower mass limit had been smaller, say, 0.5 Mq, then more than two stacks would 
have been required. It is easy to see how this schema can be generalized for smaller compo- 
nent masses. The details of placement using the above scheme towards the high mass end 
of the parameter space is shown in Figure 4. 

The template placement schema described above stacks rectangular templates centered 
on curves parallel to the equal mass line, and the size of each tile covers a significant portion 
of the parameter space /2. As such, it is important that the length of templates in any stack 
does not vary too much. Indeed, the length of these tiles is seen to change by only a few 
percent over fl. However their orientations change by ~ 0.1 radians and is seen to be a 
dominant effect at the high mass end. The parameter A^^ takes care of this in our schema 
by introducing extra overlaps between neighboring templates. We thus find that the issue 
of varying template sizes is less of a concern than varying orientation of the templates. 
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This completes the formal description of the template placement algorithm. 
D. Simulation of EHS 

We performed Monte Carlo simulations to check the performance of the EHS algorithm 
against simulated detector noise with injected signals at different positions in the parameter 
space. 

Only the trigger stage of the EHS is of relevance in this context since, the fiat search used 
locally around a candidate event in the second stage is quite efficient in extracting relevant 
signals. The purpose of these simulations was to ascertain if the noise crossings were as 
expected as used in the analysis and design phase of the EHS algorithm. Secondly, since 
template placement method used in the first stage does not center the tiles on the 77 = 1/4 
line, we would like to check how efficiently signals with parameters close to this line are 
triggered by the first stage templates. 

Wc chose a bank of A/"/^"* = 11 trigger stage templates covering the fraction of the pa- 
rameter space where the ambiguity function is non zero. For case of discussion, we label 
these from Ti to Tu (see Figure 6). The simulation was carried out in two phases. In the 
first phase we parsed data segments of 512 sec duration containing only simulated Gaussian 
stationary noise with LIGO-I power spectral density through these templates. In the second 
phase, a chirp signal with a predetermined maximum SNR p was added to the noise. The 
trigger stage sampling rate was chosen to be 512 Hz, such that the number of points in each 
data segment in this stage was A^i = 2^^. The spectral window for this stage is taken to be 
30 Hz < / < 256 Hz. 

The noise in the interferometers is assumed to be a Gaussian, whose characteristics remain 
constant in time. We used the LIGO-I noise spectral density to color samples of white 
Gaussian noise. From (2.8), one notices that in presence of noise only, the filtered SNR 
from each of Af^^^ = 11 templates is expected to be realizations of a random process with 
Raleigh distribution. As remarked earlier in the context of fine bank placement, the filtered 
output from each of the templates is correlated. We account for this fact by introducing the 
quantity < e < 1, which is used to reduce the effective number of independent random 
variables. Using these, one can calculate the probability of the test statistic crossing a preset 
threshold Ci- This is precisely the false alarm probabifity Qo{J^t^\Ci) a function of the 
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FIG. 5: False alarm probability estimates as a function of the first stage threshold for Nt = 11 
templates. The solid curve is obtained from (3.12) with e = 0.95, whereas the solid circles are the 
Monte Carlo estimates. 

number of templates used and the first stage threshold. Assuming Gaussian noise, it can be 
expressed as 

Qo(ArW,Ci) = 1 - exp[eAr«iVpadexp(-|)], (3.12) 

where A^pad = ^pad x /iamp is the number of points in the non-overlapping segments of 
adjacent analysis epoch, Tpad- This curve is then compared with the result obtained from 
simulations and e is chosen to make the analytical expression agree with the Monte Carlo 
results. The parameter e should be taken as a measure of the statistical independence of the 
templates; e = 1 means all the A/'/^''A^pad random variables are independent. In Figure 5, we 
have plotted the Monte-Carlo estimates of the false alarm probability. For the trigger stage 
of the EHS, where the distances between templates are quite large, we expect the assumption 
of statistical independence to hold to a large extent. Indeed, the best fit parameters for the 
curve gives e ^ 0.95. 

In the next phase of simulations, we added known signals to interferometer noise realiza- 
tions and processed the data through eleven templates. 

We chose our signal parameters at three characteristic positions in the parameter space 
with respect to the centers of the templates such that they represented three different de- 
tection scenarios. In case (A), the signal parameters were chosen to be favorably placed 
with respect to the filters. This was done by taking a point close to the center of Tg (see 
discussion in caption of Fig 6). In case (B), we chose the signal to have parameters such 
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FIG. 6: The positions of A/"t =11 templates used in the simulation and the relative positions of 
the signals injected into the interferometer noise. The templates are labeled Ti to Tn. See the 
text for a description of the three simulation cases A, B and C. 

TABLE II: Monte Carlo estimates of detection probability and percentage of triggers generated by 
individual Nt = 11 templates in three different cases : (A) signal lies on Tg and close to template 
center (B) signal parameters lie on Ty and are quite far away from the center of the templates. 
Actually in this case the parameters lie close to the equal mass line. (C) The parameters lie exactly 
between T5 and Tg, and close to the equal mass line. The templates are individually labeled from 
Ti to Tn. 



Case Qd{%) 


ryn rri rri rri ryn rri rri rri rri rri rji 

-Ll -t2 J-3 -'4 -^5 -^8 -ig-flO-ill 


A 99.1 
B 97.2 
C 95.7 


0.0 0.0 0.0 0.0 0.3 99.2 0.5 0.0 0.0 0.0 0.0 
0.0 0.0 0.0 0.0 0.0 0.0 90.0 9.9 0.1 0.0 0.0 
0.0 0.0 0.0 0.2 53.9 44.3 1.2 0.2 0.2 0.0 0.0 


Noise 3.7 


0.3 0.7 0.3 0.3 0.3 0.2 0.2 0.5 0.2 0.2 0.5 



that it lay in the area bounded by T7, but placed far away from the template centers - close 
to the equal mass line. In case (C), the signal was chosen to lie close to the equal mass line, 
exactly between T5 and Tg. We believe that these scenarios are typical for templates placed 
anywhere in the (tq, T3) parameter space, as it is more or less flat. 

When only noise was processed, the trigger stage extended hierarchical search templates 
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registered on an average 0.3% false alarm probability per template. However we found that 
these alarms were isolated events in the sense that they were not triggered by multiple 
adjacent templates in unison. On the contrary when a bona fide signal was present, there 
were always a cluster of nearby of templates which showed crossings above the trigger stage 
threshold. For the purposes of this paper, we choose the crossing with the maximum SNR 
for determining the trigger template; that is we select just one template in a given realization 
of the interferometer noise. This is like a one point estimate of the signal parameters when 
multiple trigger templates have shown crossings. The Monte Carlo simulation described 
below justifies this approach. 

The results of this phase of the Monte Carlo simulation are summarized in Table II. 
In absence of a signal, we find that all the templates are equally likely to register a false 
alarm, the probability of which is in agreement with (3.12). In presence of a genuine signal, 
one notices that when the overlap between the signal and templates is high (case A), the 
signal is almost always correctly triggered (by Tg). When the distance between the signal 
and templates is large (case B) then there is significant loss in correlation. In this case, we 
notice that about 10% of the triggers spill over to the adjacent templates. Finally, when 
the signal hes exactly between two templates (case C), the triggers are equally distributed 
between the adjacent templates. 

Given these results, we can draw a strategy for chalking out the area over which the 
local flat search needs to be carried out in the second stage around every flrst stage trigger 
template. We introduce a factor a which quantifies this idea, a is the ratio of the area of the 
parameter space searched in the second stage for a given click to the area of the first stage 
template. Thus, a — 1 means the fine bank search is carried out within an area of the first 
stage template. But as the simulations show we need to search beyond this region with the 
fine bank, and thus a needs to be chosen greater than unity in most of the parameter space. 
The parameter space boundaries can however make a < 1 especially at the low mass end 
where the parameter space tapers and very little area within the parameter space is required 
to be searched. Below we quote the the value of a averaged over the entire parameter space. 
If the interferometer noise is adequately described by a Gaussian random process, then our 
simulations bear out that searching two adjacent templates around the clicked template(s) 
should account for the correct location of the signal to < 1% error. Thus it is sufficient to 
set a = 2.5. 
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In presence of non-Gaussian features in the interferometer noise, the behavior of triggers 
can be quite unpredictable and we anticipate formidable difficulties in choosing the neigh- 
borhood for the fine search. Strategy based on the actual noise behavior will have to be 
devised. More work needs to be done towards a robust estimation of signal parameters from 
the trigger stage output itself so that the cost of the second stage can be minimized. 

IV. COMPUTATIONAL COST OF EHS ALGORITHM 

The computational cost in the EHS algorithm comes from (a) processing the data segment 
through all the first stage trigger templates, (b) a fine bank search in the neighborhood of 
all the triggers generated in (a) above and (c) cost of preprocessing and passing putative 
events between coarse and fine banks of the hierarchical search. In the cost analysis, we do 
not consider (c). However, we pause to note that if (c) is sufficiently large, it can completely 
nullify the gains we claim the extended hierarchical search method makes over a flat search. 
It maybe possible to setup the problem such that in the worst case, it is no worse than a 
fiat search. 

Let a data segment of T sec duration be sampled at /iamp Hz in the first stage and 
/ifmp Hz in the second stage. The total fioating point operations performed in the first stage 
is given by 

where, M^^^ is the total number of first stage templates employed. Furthermore, the fre- 
quency of triggers generated in (a) depends on the first stage trigger threshold d. As we 
shall see later, Ci ~ 6.05 is the optimum first stage threshold. This corresponds to the 
optimum size of first stage tiles at F ~ 0.82 contour level, which gives us At*^^^ = 560 in 
this stage. If such a threshold is put, the average number of crossings fic due to noise alone 
(assumed Gaussian and stationary ) is given by 

Uc^Mi'^ Qo(l,Ci), (4.2) 

where (5o(l, Ci) is the false alarm probability for a single template and is given by, 

go(l,Ci) = 1 -exp[e7Vpexp(-|)], (4.3) 

where e ^ .95 was the value obtained from the simulations which we mentioned in the 
previous section. 
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From this false alarm probability, the second stage floating point operations coming only 
from false alarms is estimated to be, 

M^^ = n,a^ (6/ifipTlog2(/i^ipT)) , (4.4) 

where 7 = jV/^V-^t^^ is the ratio of the number of templates in the second stage to the 
number of templates in the first stage, and a'y is the number of fine bank templates required 
to search the neighborhood of the trigger template that clicks. The Monte Carlo results 
indicate that o; = 2.5 should be sufficient. We round this up to the integer value 3. Estimates 
of the computational cost and the speed-up factor over the flat search for these values of a are 
tabulated in Table III. The computational cost estimates make the plausible assumption that 
the analysis will be dominated by the need to dismiss false alarms, rather than processing 
true events. Therefore, the computational cost incurred when a signal is actually present in 
the data segment is not considered in these estimates. In our analysis, we have considered 
the second stage cost to be given by Eq. (4.4). 

The online computational power for the two stages can be calculated by dividing the 
number of floating point operations m—1, 2 by the processed (zero-padded) length 

of the data segment Tpad^ 

J^S - -7^- (4.5) 

pad 

The total online speed Nf^°^^ is the sum of the two online speed for the two stages: 

A/'fS:?=A/'S+ A/Jot (4.6) 



A. Choosing the optimal C,i 

We consider data segments of 512 sec duration which accommodate chirps of maximum 
length 95 sec with sufficient padding corresponding to a minimum mass limit of 1.0 M©. We 
consider first and second stage sampling frequencies of 512 Hz and 2048 Hz respectively. The 
cutoff frequency for the chirp in the first stage is set at /cut-off = 256 Hz. Thus the number 
of data points to be processed are 2^^ and 2^° for the coarse and fine banks respectively. The 
signals of minimal strength that can be observed, assuming a minimum detection probability 
of 95% and 3% mismatch between templates turns out to be 9.17 [29]. The first stage iSmin for 
the above mentioned cut-off can be calculated to be 92% of 9.17 which is 8.44. If we choose 
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a mismatch level of F in order to tile the templates, the minimum S'obs seen by each template 
will be 8.44 X F. The first stage trigger level must be chosen at approximately 8.44 x F — 0.7 
so that the detection probability exceeds 95% even in this stage. As discussed in [29], 
although minimum detection probabilities of 95% are chosen at each stage of the hierarchical 
search, the true detection probability is still ~ 95% because of the strong correlation among 
the statistics of the first and second stages. 

For relatively low values of F, the size of each individual rectangle is very large, so that 
fewer templates suffice to cover the parameter space. However, since is also reduced in 
the process, there will be far too many crossings due to noise (false alarms) each of which 
which need to be followed up by the fine bank i.e second stage filters. Thus the second 
stage cost increases. On the other hand, when F is high, the area covered by individual first 
stage tiles will be quite small, and therefore a larger number of tiles are needed to cover the 
parameter space thereby increasing the first stage cost. 

At some optimum value of F = Fopt, the total cost reaches a minimum. This is the 
optimum contour level for laying the first stage templates. The optimum contour level also 
determines corresponding optimum value of the trigger threshold Ci. From Fig 7, we find 
that Fopt ~ 0.8. 

TABLE III: Estimates of the computational cost for the two stage EHS algorithm and gain factors 
over the one step fiat search algorithm. F is the contour level of the ambiguity function used to 
lay the trigger tiles. The speed-up factor Q is defined as the ratio of required online computational 
cost for the flat search to Ag*?*'*. The online flat search cost is estimated to be ~ 3.2 G-Flops. 
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Using the above formulae, we are now in a position to estimate the online computational 
cost of the EHS. Recall that for the mass range considered, the longest chirp signal occurs 
for the smallest component masses mi = m2 = 1 M0. For the lower cut-off of fi = 30 Hz 
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0.84 

FIG. 7: (a) Number of trigger templates plotted against T. (b) Online cost of EHS algorithm 
plotted against T. (c) The average number of crossings in the first stage, (d) The speed-up factor 
G factor defined as ratio of online flat search cost to online total EHS cost is plotted against T. As 
the contour level F is increased, the first stage cost increases because more trigger templates are 
required, while the second stage cost goes down because the false alarm rate goes down. The total 
cost is minimum at an optimum value Fopt ~ 0.8 where the speed-up factor is maximum. 

this signal is about 95 sec. Thus Tpad ~ 512 — 95 = 417 sec. In the description of the flat 
search we mentioned that a total of A/t = 11, 050 templates were required. Furthermore, 
implementing the trigger stage template placement method described above, we find that a 
total of Afl;^^ = 564 templates would be required in this stage. Finally, setting = 6.05 and 
using (4.2 - 4.6) we have Ag^p*^ = 45.1 M-Flops. This should be compared with the cost of 
a flat search given in Table I. For the similar mass limit, we find that the EHS algorithm 
reduces the online computational cost requirement over a flat search by a factor of ~ 68. 
The results for various choices of the parameters a and F are summarised in Table III. 

V. CONCLUSION 

We have investigated the performance of a new implementation of the two-step hierarchi- 
cal search algorithm for detection of gravitational waves from compact inspiraling binaries 
where the flrst stage is coarsely sampled in time, while retaining the earlier flavor of hierar- 
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chy of templates over the mass parameters. The noise power spectral density used in this 
work is that of the initial LIGO. We have described a new method of placing templates in 
the trigger stage of the hierarchical search. We have also clearly outlined methods to set up 
thresholds at the two levels based on probability arguments. 

The calculation of false alarm probabilities and detection probabilities in this paper follow 
from a basic assumption of Gaussianity and wide sense stationarity of the noise in the 
gravitational wave detectors. Also, we have assumed statistical independence of the output 
from the templates. However, we are aware that this may not be the case when the templates 
are placed closely. We have performed Monte Carlo simulations to verify the effect of 
statistical correlations between the templates. Our simulations show that there is very little 
correlation between the processed output of the first stage templates. We beheve that this 
is because of the fact that in this stage the template separations are quite large compared 
to earlier implementations of the hierarchical search. We also find that even when the 
signal parameters are far away from the template centers, there is no appreciable loss in 
detection probability. This result is however obtained for Gaussian noise. We estimate 
that an optimum implementation of the EHS algorithm would reduce the required online 
computing power by a factor of 65 — 70 over a flat search. The shape of the noise power 
spectral density has an important bearing in the calculation of these numbers. 

If most of the compact binaries are likely to have equal companion masses, templates 
centered on the 1] = 1/4 line will pick them up with a higher detection probability. But 
this entails using greater number of first stage templates since a significant fraction of these 
templates would spill over into the unphysical part of the parameter space - below the 
T] — 1/A line. We have adopted a different strategy wherein we have tried to minimize this 
template spillage. Our simulations show that even when the signal parameters lie close to the 
equal mass line, there is no appreciable loss in detection probability. However, to improve 
the probability of detection in the scheme, templates can be centered on the bisector of 
the two boundary lines (a) mi = 1712 and (b) nii = M^^^. This strategy would increase 
the detection probability of the signals in the deemed parameter space without using more 
templates. Note that by placing the tiles above the 77 = 1/4 line, in the first stage, we are 
actually searching more than the mass range of 1.0 — 30.0 M©, because the tiles span a region 
which correspond to actual physical values of the individual masses but outside this range. 
If we include this region in our parameter space, then because the rectangles efficiently tile 
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the total space - by definition - the gain factor could rise above 100. 

A search code for implementing the EHS algorithm is being written using the LIGO 
Algorithm Library (LAL) [14] for the LIGO Data Analysis System (LDAS) [15]. It is to be 
noted that the existing LAL implementation of the template bank generation is based on the 
quadratic approximation of This is not suitable for the trigger stage of EHS which uses F 
(~ 0.82). At these values of F, the quadratic approximation to the ambiguity function is no 
more adequate - which means that it is not feasible to use the metric formahsm [25] to place 
trigger stage templates in the EHS scheme. Therefore, we plan to implement the approach 
of laying out the trigger stage templates using the algorithm described in this paper. The 
fine-bank templates will be layed out using the LAL bank package. 

For each event detected in the first layer of the hierarchy, the approximate time of arrival 
ta is already determined. Thus, only a smaller segment of the data at full sampling rate 
/ifmp needs to be processed in the second stage. This could provide further savings in the 
second stage cost. However, as can be seen from Table 111, most of the computational cost 
is incurred in the trigger stage, so that the improvement in speed-up factor will be small. 

We have not addressed the issue of non-Gaussianity in this paper. However we believe 
that tail features would lead to higher false alarm rates in the trigger stage - necessiating an 
upward revision of the Fopt from present values of ~ 0.82. This would lead to a reduction 
in the computational advantage of the EHS method as more templates would be needed at 
this stage. But perhaps putting another sieve in between the hierarchy layers may alleviate 
the problem. 
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